RabbitMQ - Disk Persistence
RabbitMQ is one of the oldest message brokers out there in the wild, so its code has evolved a lot over the past 10 years. As part of that evolution, its persistence mechanisms were heavily altered. From heavily memory-based to persisting messages directly to disk, disk persistence changed across queue implementations and message store versions.
(This podcast w/ Michael Klishin and --- goes through some of the background history of Rabbit.) 10 Years of RabbitMQ - Spotify
Note: We are talking here about durable queues and persistent messages, i.e. messages that are not lost when a node/broker restarts.
Classic Queues Persistence
Rabbit Classic Queues V1 #classicqueues were designed at a time when SSDs were not a cheap resource, so frequent disk I/O operations were far more costly and slow. This led to messages being stored mostly in memory, with only some big messages being flushed to disk.
As resources got cheaper, disk I/O became far cheaper, and thus Lazy Queues were born: a new queue mode that wrote messages to disk much earlier and more frequently than the old approach. Since RabbitMQ is really memory sensitive due to #queuemessageindexes, this heavily improved the memory profile, overall throughput and resiliency whenever there is a message backlog.
Currently, as of 3.13x, all Classic V1 queues operate almost like lazy queues.
What is the storage difference between Classic Queues V1 and V2?
Messages are stored quite differently between the two versions. Since V1 was more memory-based, it kept a bigger portion of messages in memory. Besides that, when V1 was introduced we only had a single message store (two actually, but they work as one) for the entire broker.
Classic V2 has a different storage implementation, both at the queue index and at the message store.
This video from the RMQ team has a lot of insight into this. (RMQ Team showing lazy, v2 and v1 queues)
Message Store Internally
In the #rabbitmq/rabbitmq-server repository, we can check the file rabbit_msg_store, which holds the logic to interact w/ the persistence layer.
RMQ stores messages in *.rdq files (16MB by default, but configurable via ...). These files are called #segmentfiles, and they are basically a bunch of aggregated messages, each with a ref count, i.e. the number of queues that reference the message.
Example:
We publish a message to a fanout exchange with 10 bindings, so 10 queues receive the message. Rabbit only writes 1 message to disk, with a ref count of 10. As each queue acknowledges the message, the ref count gets decremented. As soon as the `ref count` reaches 0, the message becomes eligible for garbage collection.
For the message store to work properly, Rabbit relies on a couple of data structures and ETS tables.
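The fanout example can be modelled with a small toy store. This is only a sketch of the ref-counting idea, not RabbitMQ's implementation: the class `RefCountedStore` and its methods are hypothetical names made up for illustration, and everything lives in memory instead of segment files.

```python
class RefCountedStore:
    """Toy model of a shared message store (hypothetical, not RabbitMQ code)."""

    def __init__(self):
        self.payloads = {}    # msg_id -> payload, stored only once
        self.ref_counts = {}  # msg_id -> number of queues still referencing it
        self.writes = 0       # how many payloads were actually written

    def write(self, msg_id, payload):
        """Called once per queue that receives the message."""
        if msg_id in self.ref_counts:
            self.ref_counts[msg_id] += 1  # already stored: just add a reference
        else:
            self.payloads[msg_id] = payload
            self.ref_counts[msg_id] = 1
            self.writes += 1

    def ack(self, msg_id):
        """Called when one queue acknowledges the message."""
        self.ref_counts[msg_id] -= 1

    def garbage(self):
        """Messages nobody references anymore, i.e. eligible for GC."""
        return [m for m, count in self.ref_counts.items() if count == 0]


store = RefCountedStore()
for _queue in range(10):  # fanout exchange with 10 bound queues
    store.write("msg-1", b"hello")

refs_after_publish = store.ref_counts["msg-1"]

for _queue in range(10):  # every queue acknowledges the message
    store.ack("msg-1")
```

After the publish loop the payload has been written once with 10 references, and after the ack loop `store.garbage()` reports `msg-1` as collectable.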
- Message Location
- Fields: -..-
- This DS is responsible for keeping information about where the message is located on disk ??
- Is used in the msg store index
- FileSummary
- A data structure that
- MessageStoreState (msstate) & ClientMessageStoreState (client_msstate)
- These guys keep the state of the message store
RabbitMQ Disk GC
ref (rabbit_msg_store, lines 1654 - 1900) #diskgarbagecollection #diskgc
Rabbit has a disk garbage collection mechanism in rabbit_msg_store_gc, which truncates #segmentfiles whenever 50% of the ref counts are 0, within 2 files. (add reference here from rabbit-internals)
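One possible reading of the 50% rule, sketched below, is a per-file check on the share of messages whose ref count has dropped to 0. The name `GARBAGE_FRACTION`, the function `needs_gc`, and the choice to count messages rather than bytes are all assumptions for illustration. How the "within 2 files" part applies is not modelled here.

```python
GARBAGE_FRACTION = 0.5  # assumed threshold, taken from the "50%" in these notes


def needs_gc(ref_counts, threshold=GARBAGE_FRACTION):
    """Return True when at least `threshold` of the messages in a file are dead.

    `ref_counts` is a list with one ref count per message stored in the file.
    """
    if not ref_counts:
        return False  # an empty file has nothing to collect in this model
    dead = sum(1 for count in ref_counts if count == 0)
    return dead / len(ref_counts) >= threshold


mostly_live = needs_gc([3, 1, 0, 2])  # 1 of 4 messages is dead
half_dead = needs_gc([0, 0, 1, 2])    # 2 of 4 messages are dead
```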
Rabbit 3.12x
After Rabbit 3.13x
commit: 32816c0a76abf29abdb522befd52e4168f608c16 - rabbitmq-server
The GC entry point is the maybe_gc function. This function lists all files in the message store that are not locked, locks them, and proceeds to rabbit_msg_store_gc:compact.
The Disk GC in RabbitMQ has 3 main operations, which are called via rabbit_msg_store_gc. That module applies the scheduling logic and proxies the calls to rabbit_msg_store.
- The operations are:
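The flow described above (list unlocked files, lock them, hand them to compaction) can be sketched roughly as follows. This is a simplified model, not a translation of the Erlang code: the `FileSummary` fields shown here and the `compact` callback are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileSummary:
    """Hypothetical per-file record; only the fields this sketch needs."""
    name: str
    locked: bool = False


def maybe_gc(files, compact):
    """Lock every currently unlocked file and pass it to `compact`.

    Returns the names of the files that were scheduled for compaction.
    """
    scheduled = []
    for summary in files:
        if summary.locked:
            continue  # already being worked on, skip it
        summary.locked = True  # lock before handing the file over
        compact(summary)
        scheduled.append(summary.name)
    return scheduled


files = [FileSummary("file-0"), FileSummary("file-1", locked=True), FileSummary("file-2")]
compacted = []
scheduled = maybe_gc(files, lambda summary: compacted.append(summary.name))
```

In this model `file-1` is skipped because it was already locked, and the other two files end up locked and passed to the callback.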
- delete
- truncate
- compact
Compact
The compaction algorithm is a simple and naive defragmentation algorithm.
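To illustrate what naive defragmentation means here, the sketch below slides every live message toward the start of the file and reports the size the file could be truncated to. It is one simple way to defragment, and the actual strategy in rabbit_msg_store may differ. The entry layout `(msg_id, size, ref_count)` and the function name `compact_file` are hypothetical.

```python
def compact_file(entries):
    """Naive defragmentation of one segment file.

    `entries` is a list of (msg_id, size, ref_count) tuples in on-disk order.
    Returns (new_offsets, new_file_size): where each live message ends up,
    and the size the file can be truncated to afterwards.
    """
    new_offsets = {}
    write_offset = 0
    for msg_id, size, ref_count in entries:
        if ref_count == 0:
            continue  # dead message: its bytes are simply dropped
        new_offsets[msg_id] = write_offset  # live message moves into the hole
        write_offset += size
    return new_offsets, write_offset


entries = [
    ("a", 100, 0),  # dead
    ("b", 50, 2),   # live
    ("c", 30, 0),   # dead
    ("d", 20, 1),   # live
]
offsets, new_size = compact_file(entries)
```

With this input the two live messages end up back to back at offsets 0 and 50, and the file shrinks from 200 to 70 bytes.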